SUMMARY

In addition to the general 3'-5' exonuclease domain described by Bernad et al. [Cell 59 (1989) 219-228], significant amino
acid (aa) sequence similarity has been found in the C-terminal portion of 27 DNA-dependent DNA polymerases belonging
to the two main superfamilies: (i) Escherichia coli DNA polymerase I (PolI)-like prokaryotic DNA polymerases, and (ii) DNA
polymerase α-like prokaryotic and eukaryotic (viral and cellular) DNA polymerases. The six most conserved C-terminal
regions, spanning approx. 340 aa, are located in the same linear arrangement and contain highly conserved motifs and critical
residues involved in the polymerization function. According to the three-dimensional model of PolIk (Klenow fragment),
these six conserved regions are located in the proposed polymerization domain, forming the metal and dNTP binding sites
and the cleft for holding the DNA template. Site-directed mutagenesis in the φ29 DNA polymerase supports some of these
structural predictions. Therefore, it is likely that a 'Klenow-like core', containing the DNA polymerase and 3'-5' exonuclease
activities, has evolved from a common ancestor, giving rise to the present-day prokaryotic and eukaryotic DNA polymerases.



INTRODUCTION 

Abbreviations: aa, amino acid(s); AcMNPV, Autographa californica
mononuclear polyhedrosis virus; α-human, human DNA polymerase α;
α-yeast, yeast DNA polymerase I; Bs(III), Bacillus subtilis DNA
polymerase III; BuAdATP, butylanilino deoxyadenosine 5'-triphosphate;
BuPdGTP, butylphenyl deoxyguanosine 5'-triphosphate; dNMP,
deoxynucleoside monophosphate; dNTP, deoxynucleoside triphosphate;
δ-yeast, yeast DNA polymerase III; EBV, Epstein-Barr virus; Ec eps,
E. coli DNA polymerase III, epsilon subunit; HCMV, human
cytomegalovirus; HSV, herpes simplex virus; nt, nucleotide(s); PAA,
phosphonoacetic acid; PolI, E. coli DNA polymerase I; PolIk, Klenow (large)
fragment of PolI; Spn, Streptococcus pneumoniae; Taq, Thermus aquaticus;
VZV, varicella zoster virus; wt (or WT), wild type. Mutations are
indicated by original aa (in single-letter notation), its position and the
replacing aa: i.e., Y390S = Tyr390 → Ser.

Structural and functional studies from several laboratories
have shown a close relationship between a number of
DNA-dependent DNA polymerases including both
prokaryotic and eukaryotic, cellular and viral, protein-primed
and RNA-primed DNA replicases (Gibbs et al., 1985;
Larder et al., 1987; Knopf, 1987; Bernad et al., 1987;
Wong et al., 1988), as well as the Saccharomyces cerevisiae
REV3 DNA polymerase, probably involved in DNA repair
(Morrison et al., 1989). This group, named α-like DNA
polymerases, is characterized by the presence of several
linearly arranged and conserved regions of aa homology,
located at the C-terminal portion of the protein, which were
originally proposed to form the catalytic site involved in
dNTP binding (Gibbs et al., 1985). In addition, the
structural homology detected correlates well with similarities in
sensitivity to inhibitors such as aphidicolin, BuAdATP and
BuPdGTP (reviewed by Huberman, 1981; Khan et al.,
1984; 1985; Blanco and Salas, 1986; Bernad et al., 1987).

Another group of DNA polymerases, also related by
structural and functional similarities, is named E. coli
PolI-like DNA polymerases. In this case, the similarities in the
C-terminal portion of several enzymes from this group
(Ollis et al., 1985b; Lopez et al., 1989; Lawyer et al., 1989;
Leavitt and Ito, 1989) support a three-dimensional
structure homologous to that of PolIk (Ollis et al., 1985a).

Recently, we have shown that these two groups are
interrelated by the presence of three highly conserved segments
which are located in the N-terminal portion of each DNA
polymerase. Based on site-directed mutagenesis studies
and on the extrapolation to the x-ray structure of PolIk, we
have proposed that these three segments form a highly
conserved three-dimensional domain containing the 3'-5'
exonuclease active site (Bernad et al., 1989). On the other
hand, it is generally accepted that the α-like and PolI-like
DNA polymerases are not significantly related in the
aa sequences involved in the polymerase function, located
in the C-terminal portion of the polypeptide (Ollis et al.,
1985b; Earl et al., 1986; Jung et al., 1987; Bernad et al.,
1987; Spicer et al., 1988; Wong et al., 1988). However, as
shown in this paper, structural features such as the metal- and
dNTP-binding sites and the regions which form the cleft for
holding the DNA template are shared by enzymes
belonging to both groups. The catalytic importance of several of
the most conserved regions of the C-terminal portion, tested
by site-directed mutagenesis in the φ29 DNA polymerase,
supports some of these structural predictions. Taking into
account these structural and functional data, the modular
organizations of enzymatic activities in prokaryotic and
eukaryotic DNA polymerases are compared.



RESULTS AND DISCUSSION 

(a) Identification of conserved aa regions from sets of 
sequences 

Due to the difficulty of obtaining computer-derived
multiple alignments when applied to proteins bearing little
homology, we considered four subsets of DNA polymerases
based on previous reports showing structural homology
in the C-terminal portion of several groups or pairs of
DNA polymerases. Group A, or PolI-like DNA
polymerases, including prokaryotic enzymes such as PolI, Bs(III), Ec
eps, and those from Spn, Taq, and phages T5, T7 and
SPO2. Group B, or eukaryotic-viral DNA polymerases,
including those from HSV type 1, HCMV, EBV, VZV,
AcMNPV, vaccinia virus, fowlpox virus, and phage T4.
This latter enzyme, although prokaryotic, has been included
in this group due to its sensitivity to several inhibitors such
as aphidicolin and BuPdGTP and to the presence, in its
primary structure, of regions of striking similarity with
animal virus DNA polymerases, as was first described by
Bernad et al. (1987). Group C, or cellular DNA
polymerases, including α-human, α-yeast, δ-yeast, and REV3
from yeast. The nomenclature adopted for this group
corresponds to the new one proposed by Burgers et al. (1990).
Group D, or DNA polymerases from terminal
protein-containing genomes, the most heterologous, including
prokaryotic and eukaryotic enzymes such as those from adenovirus
type 2, plasmids pGKL2 and pGKL1 from yeast, plasmid
pClK1 from fungi, S1 mitochondrial DNA from maize,
plasmid pAI2 from fungi, phage PRD1, phage M2, and
phage φ29. Several DNA polymerases belonging to
groups B, C and D were previously classified as α-like
DNA polymerases (Bernad et al., 1987; 1989; Wong et al.,
1988).

After optimal alignment of the sequences corresponding
to each subset, the multiple alignment of the four subsets
was carried out, proceeding in order from the most related
groups to the least (C-B-D-A). Binary alignments were
carried out by computer analysis using the programs
BESTFIT, PRETTY and GAP from the UWGCG
(University of Wisconsin Genetics Computer Group;
Devereux et al., 1984). This progressive alignment of the
four subsets of DNA polymerases allowed us to better
consider: (1) the existence and relative location of discrete
segments of aa similarity, (2) the variability occurring in the
conserved regions, and (3) the presence of specific
conserved residues corresponding to a particular subset.
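
The binary alignment step underlying this progressive scheme can be illustrated with a minimal sketch. The Python function below computes a Needleman-Wunsch global alignment score, the dynamic-programming approach on which global alignment programs of the GAP type are based. It is not the UWGCG implementation: the function name, the scoring values (match, mismatch, linear gap penalty) and the toy peptides are assumptions chosen only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not those of GAP.
    Only the best score is returned; no traceback is performed.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # column 0: prefix of `a` against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap introduced in `b`
            left = cur[j - 1] + gap   # gap introduced in `a`
            cur.append(max(diag, up, left))
        prev = cur
    return prev[-1]


# Hypothetical toy peptides (not taken from the alignments in this paper).
print(global_alignment_score("YGDTDS", "YGDTDS"))  # 6
print(global_alignment_score("YGDTDS", "YGDTS"))   # 4 (five matches, one gap)
```

In a progressive scheme of the kind described above, such pairwise scores would be computed first within each subset and then between subsets, in the order C-B-D-A.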

N-terminal domain. Using this progressive alignment
method, the three conserved regions ExoI, ExoII and
ExoIII, previously described in 19 α-like and PolI-like
DNA polymerases (Bernad et al., 1989; where original
references are reviewed), were detected in other recently
reported sequences of DNA polymerases belonging to the
four subsets (Fig. 1), such as phage T5 DNA polymerase
(group A; Leavitt and Ito, 1989), fowlpox virus DNA
polymerase (group B; Binns et al., 1987), δ-yeast (Boulet et al.,
1989) and REV3 (Morrison et al., 1989) DNA polymerases
(group C), and pGKL2 (Tommasino et al., 1988), pClK1
(Oeser and Tudzynski, 1989), pAI2 (Kempken et al., 1989)
and M2 (Matsumoto et al., 1989) DNA polymerases
(group D). The inclusion of the eight new sequences of
DNA polymerases allowed us to improve the alignment
corresponding to the ExoIII segment of T4, PRD1 and φ29
DNA polymerases, with respect to that previously reported
(Bernad et al., 1989); in addition, the ExoIII segment of
yeast pol α was changed based on significant homology with
human pol α (T.S.-F. Wang, personal communication).
Therefore, we propose that the critical residues homologous
to Y497 and D501 from PolI (described in section c) are:
Y320 and D324 (T4 DNA polymerase); Y145 and D149
(PRD1 DNA polymerase); Y165 and D169 (φ29 DNA
polymerase). As shown in Fig. 1, the aa sequence of the
27 DNA polymerases compared contains regions ExoI,
ExoII and ExoIII in the same linear arrangement (see also
Fig. 5), supporting the idea that these three regions
form a structurally and functionally conserved
three-dimensional domain (Bernad et al., 1989). Table I shows
the % similarity (averaged among the different pairs of
DNA polymerases) corresponding to the three Exo-regions
aligned in Fig. 1, for each group of DNA polymerases.

[Figure 1: sequence alignment of regions Exo I, Exo II and Exo III for the DNA polymerases of groups A, B, C and D; alignment not reproduced.]

Fig. 1. Highly conserved N-terminal regions within DNA-dependent DNA polymerases. The different DNA polymerases considered were grouped as
follows: A, PolI-like DNA polymerases; B, viral α-like DNA polymerases; C, cellular α-like DNA polymerases; D, protein-primed α-like DNA
polymerases. DNA polymerase nomenclature is given in the text. The three highly conserved regions ExoI, ExoII and ExoIII, previously reported by
Bernad et al. (1989), are indicated. Numbers between slashes indicate the aa position relative to the N-terminal end of each DNA polymerase. Relevant
aa similarity among the different groups is indicated in white letters; other similarities are indicated by grey boxes. The following conservative aa were
considered: S and T; A and G; K, R and H; D, E, Q and N; I, L, M, V, Y and F. Stars indicate specially conserved residues and/or PolI residues involved
in exonucleolytic catalysis (see section c for details). The location of the φ29 DNA polymerase exonuclease-deficient mutants D12A, E14A, and D66A
(Bernad et al., 1989), and that of the E. coli DNA polymerase III (ε subunit) exonuclease-deficient mutant dnaQ49 (V96G) (Echols et al., 1983), are
indicated with dots.
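
The averaged % similarity reported in Table I can be illustrated with a short sketch. The Python code below scores two aligned segments using the conservative aa groups listed in the legend to Fig. 1 and averages the score over all pairs within a group. The function names, the choice of the aligned length as denominator, the treatment of gaps as non-similar positions and the toy segments are assumptions made for illustration; the paper does not specify how the original values were computed.

```python
from itertools import combinations

# Conservative aa groups taken from the legend to Fig. 1.
CONSERVATIVE_GROUPS = ["ST", "AG", "KRH", "DEQN", "ILMVYF"]
GROUP_OF = {aa: idx for idx, grp in enumerate(CONSERVATIVE_GROUPS) for aa in grp}


def is_similar(x, y):
    """True for identical residues or residues in the same conservative group."""
    if x == "-" or y == "-":  # assumption: a gap never counts as similar
        return False
    if x == y:
        return True
    gx, gy = GROUP_OF.get(x), GROUP_OF.get(y)
    return gx is not None and gx == gy


def percent_similarity(seg1, seg2):
    """% of aligned positions that are similar (denominator: aligned length)."""
    if len(seg1) != len(seg2):
        raise ValueError("segments must be aligned to the same length")
    if not seg1:
        return 0.0
    hits = sum(is_similar(x, y) for x, y in zip(seg1, seg2))
    return 100.0 * hits / len(seg1)


def mean_pairwise_similarity(segments):
    """Average % similarity over all distinct pairs of aligned segments."""
    pairs = list(combinations(segments, 2))
    if not pairs:
        raise ValueError("at least two segments are required")
    return sum(percent_similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical aligned toy segments (not taken from Fig. 1).
print(percent_similarity("KD", "RW"))  # 50.0
print(round(mean_pairwise_similarity(["AK", "GR", "AW"]), 1))  # 66.7
```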

C-terminal domain. Taking the ExoIII segment as the
starting point for aligning the C-terminal domain, we found
22 segments of significant similarity for group A DNA
polymerases, 21 segments for group B, 24 segments for group C,
and 21 segments for group D. When the four groups were
considered for multiple alignment it was possible to align 18
segments present in all enzymes belonging to the four
groups. Fig. 2 shows the six most conserved regions
corresponding to the multiple alignment between PolI-like
enzymes (group A) and α-like DNA polymerases
(groups B, C, and D). In addition, these six regions have the
greatest homology in each DNA polymerase group. These
regions were numbered as they occurred from the N
terminus to the C terminus; however, the presence of a specific
insertion in region 2 of protein-primed DNA polymerases
(group D) led us to divide region 2 into regions 2a and 2b.

TABLE I

Amino acid similarity in the most conserved N-terminal and C-terminal regions of DNA-dependent DNA polymerases

           N-terminal             C-terminal
Group a    I      II     III      1      2a     2b     3      4      5
A          43.3   54.3   45.2     48.7   41.6   53.0   61.5   51.0   6?.?
B          53.6   69.5   54.?     61.0   61.0   53.2   60.1   53.6   61.7
C          40.7   57.8   29.8     74.4   63.0   52.6   66.6   61.5   4?.0
D          43.2   64.0   51.5     44.8   57.5   41.1   46.9   37.4   2?.0

a The different DNA polymerase groups (A, B, C and D), and the highly conserved regions (I, II, III, 1, 2a, 2b, 3, 4 and 5) were defined as indicated
in section a and Fig. 1. Numbers indicate the % similarity averaged among the different pairs of DNA polymerases belonging to each group.

[Figure 2: sequence alignment of regions 1, 2a, 2b, 3, 4 and 5 for the DNA polymerases of groups A, B, C and D; alignment not reproduced.]

Fig. 2. Highly conserved C-terminal regions within DNA-dependent DNA polymerases. The different DNA polymerases considered were grouped as
in Fig. 1. DNA polymerase nomenclature is given in section a. The six most highly conserved regions 1, 2a, 2b, 3, 4 and 5 are indicated. Numbers between
slashes indicate the aa position relative to the N-terminal end of each DNA polymerase. Numbers in parentheses indicate the number of aa residues
either connecting regions 2a and 2b or from the end of region 3 to the C-terminal end (#). Based on PolIk structural data (Ollis et al., 1985a), the
disordered regions, α-helices (lettered) and β-strands (numbered) are indicated. Relevant aa similarity among the different groups is indicated in white
letters, whereas that particularly corresponding to group A (PolI-like) and to groups B, C, and D (α-like) is indicated in grey and empty boxes, respectively.
The conservative aa considered were as indicated in Fig. 1. Stars indicate specially conserved residues and/or PolI residues involved in DNA, metal and
dNTP binding (see section c for details). The location of HSV DNA polymerase mutations showing altered sensitivity to PAA, acyclovir and aphidicolin:
A719V, S724N, V813M, N815S, T821M, G841S and R842S (Larder et al., 1987; Knopf, 1987; Gibbs et al., 1988), and that of φ29 DNA polymerase
mutants Y254F, Y390S and Y390F (described in section c and Fig. 4) are indicated by dots in regions 1, 2a and 2b. The location of HSV DNA polymerase
mutations G885R, D886N, T887K, D888A and G896V (Dorsky and Crumpacker, 1990), φ29 DNA polymerase mutations Y454F, D456G, T457P and
D458G (Bernad et al., 1989), K498T and K498R (unpublished results), and that of T4 DNA polymerase mutator mutant tsL88 (G?94S) (Hershfield,
1973), are also indicated by dots in regions 3 and 4.

Region 1, spanning 30 aa, contains in groups B, C and D
the highly conserved motif 'D--SLYP' previously described.
The multiple alignment shows that the Ser residue
corresponding to this motif (marked by a star in Fig. 2) is also
highly conserved in PolI-like DNA polymerases (group A).
Furthermore, the highly conserved aspartate in α-like DNA
polymerases is substituted by a highly conserved asparagine
(marked by a star in Fig. 2) in PolI-like DNA polymerases.
Region 2a, spanning 27 aa, contains in groups B, C and D
the highly conserved motif 'K---NS-YG' previously
described. The multiple alignment shows that the Lys residue
corresponding to this motif (marked by a star in Fig. 2) is
also highly conserved in PolI-like DNA polymerases,
although in this case the use of small gaps was necessary
for optimal alignment. Between regions 1 and 2a, an
insertion of about 50 aa, forming a highly conserved region,
is only present in protein-primed DNA polymerases
(unpublished results). Region 2b, spanning 19 aa residues,
contains in all four groups a highly conserved Arg preceded
by an invariant Gly in groups A, B and C (marked by stars
in Fig. 2); this Gly is substituted by a relatively conserved
Ala in group D. Regions 2a and 2b are connected by a small
number of aa residues in groups A, B, and C, whereas in
group D, a larger number of aa form a specific
conserved region, previously reported by Bernad et al. (1987).
Region 3, spanning 22 aa, contains in groups B, C and D
the highly conserved motif 'YGDTDS', although the Gly
residue corresponding to this motif strongly varies in
group D DNA polymerases. The multiple alignment shows
that the second Asp of this motif aligns with an invariant
Glu in group A (marked by a star in Fig. 2), whereas the first
Asp of the 'YGDTDS' motif corresponds to a Gln residue
in PolI (marked by a star in Fig. 2), Spn and Taq DNA
polymerases. Region 4, spanning 24 aa, contains in group A
an invariant Arg (boxed in Fig. 2), and two invariant Lys
and Tyr residues (indicated by stars in Fig. 2). The multiple
alignment shows that groups B, C and D share an invariant
Lys and Tyr residue (boxed in Fig. 2), in addition to another
highly conserved Lys residue (groups B and C; printed in
white letters in Fig. 2) that aligns with that corresponding
to group A. It is worth noting the high similarity in this
region between PolI and both HSV and VZV DNA
polymerases. Region 5, spanning 19 aa, contains in group A the
invariant motif 'VHD'. The multiple alignment shows that
groups B, C and D contain a highly conserved Asp or Glu
in the position corresponding to the invariant Asp in
group A (indicated by a star in Fig. 2). It is worth noting the
high similarity, in the central portion of this region, between
the EBV DNA polymerase and the PolI-like DNA
polymerases.
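
The α-like motifs quoted above lend themselves to a simple pattern search. The sketch below, in Python, scans a protein sequence for 'D--SLYP' (region 1), 'K---NS-YG' (region 2a) and 'YGDTDS' (region 3), treating each dash as exactly one arbitrary residue. That reading of the dashes, the function names and the toy sequence are assumptions made for illustration; real polymerases, particularly those of group A, deviate from these consensus motifs, so exact matching of this kind would miss many of the aligned positions shown in Fig. 2.

```python
import re

# Consensus motifs of alpha-like DNA polymerases as quoted in the text.
# Assumption: each '-' stands for exactly one unspecified residue.
MOTIFS = {
    "region 1": "D--SLYP",
    "region 2a": "K---NS-YG",
    "region 3": "YGDTDS",
}


def motif_to_regex(motif):
    """Compile a dash-notation motif into a regular expression."""
    pattern = "".join("." if ch == "-" else re.escape(ch) for ch in motif)
    return re.compile(pattern)


def find_motifs(sequence):
    """Return 1-based start positions of each motif (non-overlapping hits)."""
    return {
        region: [m.start() + 1 for m in motif_to_regex(motif).finditer(sequence)]
        for region, motif in MOTIFS.items()
    }


# Hypothetical toy sequence constructed to contain all three motifs in order.
toy = "MADAASLYPAAAKQQQNSAYGAAYGDTDSA"
print(find_motifs(toy))
# {'region 1': [3], 'region 2a': [13], 'region 3': [24]}
```

Checking that the reported start positions increase from region 1 to region 3 is one simple way to test the colinear arrangement of the motifs in a given sequence.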



The fact that the protein alignment corresponding to
these evolutionarily distant 27 DNA polymerases is colinear
(see also Fig. 5) supports the idea that these six regions
form a structurally and functionally conserved
three-dimensional domain. Table I shows the % similarity
(averaged among the different pairs of DNA polymerases)
corresponding to the six C-terminal regions aligned in
Fig. 2, for each group of DNA polymerases.

(b) Three-dimensional structure prediction

PolIk, considered as the prototype of a multifunctional
enzyme involved in the repair and replication of DNA
(Kornberg, 1980) is, to date, the only DNA polymerase
whose three-dimensional structure is known (Ollis et al.,
1985a). The crystal analysis of PolIk shows that the
polypeptide is folded into two distinct structural domains of
approx. 200 and 400 aa (Fig. 3). The smaller domain
(N-terminal) contains the 3'-5' exonuclease activity,
whereas the larger domain (C-terminal) contains the
polymerization active site. The larger domain contains a cleft,
about 20-24 Å wide and 25-35 Å deep, with the
appropriate dimensions for holding double-stranded B-DNA
(reviewed by Joyce and Steitz, 1987). By extrapolation to
the x-ray structure of PolIk, we have recently proposed that
the 3'-5' exonuclease domain is structurally and
functionally conserved among prokaryotic and eukaryotic DNA
polymerases; in particular, the three highly conserved
regions ExoI, ExoII and ExoIII (light-shaded areas in
Fig. 3) form the 3'-5' exonuclease active site (Bernad
et al., 1989).

On the basis of the homology detected among the
C-terminal portion of PolI and other DNA polymerases
belonging to groups A, B, C and D, the location of the six most
conserved regions (aligned in Fig. 2) was analyzed by
extrapolation to the three-dimensional structure of PolIk. As
shown in Fig. 3, region 1 lies in a portion not sufficiently
ordered in the crystal (tentatively depicted with dashed lines
in Fig. 3), which has been proposed to close off the fourth
side of the cleft (Ollis et al., 1985a). Regions 2a and 4,
corresponding to α-helices I and O, respectively, form two
opposite walls of the cleft, whereas regions 3 (β-strand 9
followed by α-helix L), 2b (β-strands 7 and 8), and 5
(β-strands 12 and 13) correspond to the five-stranded
antiparallel β-sheet forming the floor of the cleft. The other
twelve regions of lower sequence similarity also correspond
to three-dimensional portions having a defined secondary
structure; in general, the variable regions among the DNA
polymerases correspond to random-coiled portions
connecting well-ordered regions (unpublished results).
Exceptionally, some turns are specially conserved, such as those
existing between β-strand 9 and α-helix L (region 3) and between
β-strands 7 and 8 (region 2b) or β-strands 12 and 13
(region 5). In summary, the fact that the six most conserved
regions of aa sequence similarity are concentrated around
the cleft that is proposed to bind DNA emphasizes its
functional importance and the conservation of this
structural feature throughout the evolution of this key enzyme for
the process of DNA replication.

[Figure 3: schematic of the PolIk three-dimensional structure, showing the polymerization domain and the 3'-5' exonuclease domain; drawing not reproduced.]

Fig. 3. A Klenow-like general structure for DNA-dependent DNA polymerases. The representation of the three-dimensional structure of PolIk is from
Ollis et al. (1985a), with slight modifications. Regions that form α-helices are represented by cylinders (lettered), and those forming β-sheets by stippled
arrows (numbered); the disordered region spanning aa 569-626 is indicated by dashed lines. The N-terminal portion (residues 330-520) of the crystal
structure contains the 3'-5' exonuclease, particularly involving the three highly conserved regions ExoI, ExoII and ExoIII (I, II and III; indicated with
light-shaded areas; aligned in Fig. 1). The C-terminal portion (aa 521-928), containing the polymerization activity, involves the six highly conserved
C-terminal regions 1, 2a, 2b, 3, 4 and 5 (dark-shaded areas; shown in Fig. 2). Stars mark the location of aa residues involved in the 3'-5' exonuclease
activity (metal binding and catalysis) and DNA polymerization activity (DNA, metal and dNTP binding and catalysis) of PolI. These positions correspond
to highly conserved aa marked by stars in Figs. 1 and 2.

(c) Functional significance of the conserved regions

The structural similarity detected among PolI and other
prokaryotic and eukaryotic DNA polymerases (Bernad
et al., 1989; this paper) allows us to generalize some aspects
of the DNA polymerase function which are suggested from
the unique shape of the tertiary structure of PolIk.
Another important line of evidence comes from the use of
specific reagents or modified substrates, as well as from
site-directed mutagenesis studies of highly conserved or
putatively important aa residues in some of the related
DNA polymerases.



(i) N-terminal domain. A structurally independent
domain, as that of PolIk, contains the 3'-5' exonuclease
active site in other prokaryotic and eukaryotic DNA
polymerases. This evolutionarily conserved active site is mainly
formed by the highly conserved regions ExoI, ExoII and
ExoIII (Bernad et al., 1989; Fig. 1). In agreement with this
hypothesis, PolI, T7 and φ29 DNA polymerase mutant
proteins containing aa changes in the exonuclease active
site, made by site-directed mutagenesis, are devoid of
exonuclease activity but retain full polymerase activity
(Derbyshire et al., 1988; Tabor and Richardson, 1989;
Bernad et al., 1989). The dnaQ49 mutation, located at
position 96 in the ExoII region of the ε subunit of the E. coli
DNA PolIII holoenzyme (indicated by a dot in Fig. 1),
produces a strong mutator phenotype (Horiuchi et al.,
1978) and a defective 3'-5' exonuclease activity (Echols
et al., 1983). In addition, the N-terminal clustering of T4
DNA polymerase mutator mutants (Reha-Krantz, 1988;
1989) was also informative to localize the exonuclease
domain (Bernad et al., 1989). The more extensive homology
presented in this paper strengthens the importance of the
Asp and Glu (involved in metal binding in PolI; Derbyshire
et al., 1988) of region ExoI (indicated with stars in Figs. 1
and 3) in groups A and D, whereas only the Glu is invariant
in groups B and C. Thr358 of PolI (indicated by a star in
Fig. 1), which buries the 3'-OH group of the DNA substrate
by formation of an H-bond through its backbone amide
(Freemont et al., 1988), is also highly conserved in groups A
and D, whereas Leu361 of PolI (indicated by a star in
Fig. 1), whose side chain interacts with the base moiety of
the last nt of the DNA substrate (Freemont et al., 1988), is
conserved in groups A, B and C. In region ExoII, the
Asp424 of PolI (indicated by a star in Figs. 1 and 3), which
is involved in metal binding (Derbyshire et al., 1988), is
always preceded by an aromatic residue (Y or F; indicated
by a star in Fig. 1), and is highly conserved in all four
groups, although it can be substituted by a Glu residue.
Interestingly, with the exception of REV3 DNA polymerase,
all the enzymes belonging to the four groups contain, in the
ExoII region, an invariant Asn residue (indicated by a star
in Fig. 1) whose particular role in the exonuclease active site
is presently unknown. In region ExoIII, the Tyr497 of PolI
(indicated by a star in Figs. 1 and 3), proposed to be
involved in exonucleolytic catalysis (Freemont et al., 1988),
is almost invariant in the four groups, and the Asp501 of PolI
(indicated by a star in Figs. 1 and 3), involved in metal
binding (Derbyshire et al., 1988), is highly conserved in
groups A, B and D. These data do not support the
aa sequence alignment reported by Matsumoto et al.
(1989).

(ii) C-terminal domain. A structurally independent
domain, as that of PolIk, is proposed to contain the
polymerization active site in prokaryotic and eukaryotic
DNA polymerases. An evolutionarily conserved 'DNA
polymerase core' would be formed by the highly conserved
regions 1, 2a, 2b, 3, 4 and 5 (Fig. 2).

(iii) DNA binding. Region 1 is located in a subdomain of
PolIk that is disordered in the electron-density map (Fig. 3).
It has been suggested that this flexible subdomain may bind
to the DNA, thereby closing off the cleft and allowing the
protein to surround the bound DNA (Ollis et al., 1985a). In
fact, this region corresponds to the most or one of the most
flexible portions of the different DNA polymerases
compared in Fig. 2 (unpublished results). It has been proposed
that the function of this flexible subdomain, containing
region 1, may be to increase the processivity of the enzyme
by slowing its rate of dissociation from the DNA relative to
translocation and further nt addition (Ollis et al., 1985a).
Region 2a, located immediately after the disordered
domain, is structured as a long α-helix (I-helix in PolIk;
Fig. 3) which forms one side of the cleft. The aa sequence
of PolI contains, in this region, Lys635 (marked by a star in
Figs. 2 and 3), which has been directly involved in DNA
binding and processivity of the polymerization reaction
(Basu et al., 1988). Interestingly, a corresponding Lys
residue is invariant in both PolI-like and α-like DNA
polymerases (Fig. 2). In the latter, this Lys forms part of the
highly conserved motif 'K---NS-YG' previously described.
Therefore, these structural and functional data suggest that
regions 1 and 2a are probably involved in DNA binding
and processivity, the coupling of synthesis to movement
along the template.

To test the above hypothesis, site-directed mutagenesis of
the highly conserved Tyr residues in regions 1 and 2a of the
φ29 DNA polymerase was carried out. These Tyr residues
were selected for mutagenesis on the basis of the apparent
similarity of the motifs enclosed in these regions:
'D--SLYP' (region 1) and 'K---NS(L/V)YG' (region 2a).
Tyr254 (region 1) was changed into Phe, whereas Tyr390
(region 2a) was changed into Ser or Phe, and the mutant
proteins were overproduced, purified and assayed for 3'-5'
exonuclease, nonprocessive and processive DNA
polymerization, and protein-primed initiation. As shown in
Fig. 4B, the Y254F and Y390F mutations strongly affected
the processive elongation of the φ29 DNA polymerase
(Blanco et al., 1989), whereas the Y390S mutation
essentially had no effect. Fig. 4A shows that nonprocessive
elongation was only affected by the Y390F mutation; however,
when the turnover of dNTPs to dNMPs coupled to this
nonprocessive assay was analyzed, an abnormally high
3'-5' exonuclease activity was detected for the three
mutants, Y254F, Y390F and Y390S, being 62-fold, 16-fold
and 22-fold higher, respectively, than that of the wt protein
(Fig. 4C). The fact that the 3'-5' exonuclease activity,
assayed in the absence of polymerization, was normal for
the three mutants (not shown) indicates that the increased
turnover observed at the DNA 3'-terminus probably
reflects a slow rate of translocation of the mutant φ29
DNA polymerases along the DNA template. Therefore,
and in agreement with the proposed structural model,
regions 1 and 2a probably define DNA polymerase
regions that interact with the DNA. These results also
confirm and limit the proposed location of the exonuclease
domain, supporting its physical separation from the
polymerase domain (Bernad et al., 1989).

Interestingly, the Y254F mutation (region 1) strongly 
affected the protem*primed initiation reaction by decreas- 
ing about 30-fold the formation of the terminal protean 
(p3)-dAMP initiation complex (F<g.4D). This resut, 
together with the neighbouring location of a specific regicu 
of about 50 aa residues preceding region 2a, only present n 
protein-primed DNA polymerases (unpublished result? K 
Fig. 4. Effect of site-directed mutations in regions 1 and 2a of the φ29 DNA polymerase. The φ29 DNA polymerase mutants Y254F (region 1), Y390S
and Y390F (region 2a) were obtained, expressed and purified as described by Bernad et al. (1990a). (A) Nonprocessive DNA polymerization assay. The
filling-in reaction on EcoRI-digested φ29 DNA (0.5 µg), assayed in the presence of either wt or mutant φ29 DNA polymerase (1 ng), was carried out
essentially as described by Bernad et al. (1990b); the activity values, indicated as % of that of the wt protein, were: Y254F (80%); Y390F (20%); Y390S
(100%). (B) Processive DNA polymerization assay. The replication of φ29 DNA-protein p3 (0.5 µg), assayed in the presence of either wt or mutant φ29
DNA polymerase (1 ng), was carried out essentially as described by Bernad et al. (1990b); the activity values, expressed as % of that of the wt protein,
were: Y254F (5%); Y390F (<2%); Y390S (60%). (C) Turnover coupled to the nonprocessive DNA polymerization assay. The dAMP turnover was
determined by polyethyleneimine-cellulose thin-layer chromatography and further autoradiography of samples taken immediately after the DNA
polymerization reaction. The chromatogram was developed with 0.15 M Li formate pH 3.0, conditions in which the 5'-dAMP migrates, whereas the
DNA and the unincorporated dNTPs remain at the origin. Turnover was quantitated by counting the Cerenkov radiation corresponding to the 5'-dAMP
spot; the activity values, relative to that of the wt protein, were: Y254F (62-fold); Y390F (1 WbJd); Y390S (22-fold). (D) Protein-primed initiation assay.
The formation of the p3-dAMP covalent complex, assayed in the presence of φ29 DNA-protein p3 (0.5 ng) as template, and either wt or mutant φ29
DNA polymerase (1 ng), was carried out essentially as described by Bernad et al. (1990b), and analyzed by 0.1% SDS-10% polyacrylamide gel
electrophoresis and autoradiography; the activity values, expressed as % of that of the wt protein, were: Y254F (3%); Y390F (75%); Y390S (89%).



suggests that region 1 is involved, in addition to DNA
binding, in interaction with the priming protein. Taking into
account the proposed flexibility of the structure in the
neighbourhood of region 1, it is tempting to speculate that
the strong interaction detected between the φ29 DNA
polymerase and terminal protein (Blanco et al., 1987) is
achieved by location of the latter inside the cleft and further
stabilized by interaction with a slightly 'adapted' flexible
domain.

(iv) Metal binding. Recent crystallographic studies of
PolIk with dCTP have shown that PolI residues Gln708 and
Glu710 (region 3) are involved in the interaction with an
associated Mg2+ ion (L. Beese, J. Friedman and T. Steitz,
personal communication). Interestingly, the corresponding
region of α-like DNA polymerases contains the highly
conserved 'YGDTDS' motif, proposed to be involved in metal
binding and catalysis (Argos, 1988; Bernad et al., 1990b).
Site-directed mutagenesis in the φ29 DNA polymerase
indicated that the Thr and second Asp of the motif seem to
be the most critical residues for both the protein-primed
initiation and polymerization function of the φ29 DNA
polymerase (Bernad et al., 1990b). In agreement with their
relative functional importance, the first Asp of the
'YGDTDS' motif corresponds to the relatively conserved
Gln708 of PolI (indicated by a star in Figs. 2 and 3), whereas
the critical second Asp corresponds to Glu710 of PolI
(indicated by a star in Figs. 2 and 3), invariably conserved in
group A DNA polymerases. The involvement of region 3 of
PolI in metal binding does not support the alignment
reported by Delarue et al. (1990), in which no Gln, Glu or
Asp residues would be present in the corresponding
positions of α-like DNA polymerases. Furthermore, the change
of φ29 DNA polymerase residue Cys454 into the consensus
Gly of the motif produced an abnormal behaviour in the
usage of activating metal ions by the φ29 DNA polymerase
(Bernad et al., 1990b). The functional importance of
region 3 for polymerase activity has also been demonstrated
in the HSV DNA polymerase; in this case, single changes
in the 'GDTD' sequence destroyed the polymerization
activity (Dorsky and Crumpacker, 1990). Therefore, all
these data strongly suggest that the aspartates contained in
the 'YGDTDS' motif play a direct role in metal binding in
DNA polymerases belonging to groups B, C and D.
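
As a purely illustrative aside, locating a 'YGDTDS'-like motif in an aa
sequence can be sketched in a few lines of Python. This is not the
alignment procedure used in this paper. The relaxed pattern (allowing Cys
in place of the consensus Gly, as described above for the φ29 enzyme), the
function name find_motif and the toy sequences are all assumptions made
for the example.

```python
import re

# Relaxed pattern (an assumption for illustration): the consensus 'YGDTDS'
# plus the Gly -> Cys variant that the text describes for the phi29 enzyme.
MOTIF = re.compile(r"Y[GC]DTDS")


def find_motif(seq):
    """Return (1-based start position, matched text) for each motif hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(seq.upper())]


# Toy sequences, invented for this example; they are not real polymerase data.
toy_sequences = {
    "consensus_like": "AAYGDTDSKK",
    "cys_variant": "MMYCDTDSLL",
    "no_motif": "AAYGETESKK",
}

for name, seq in toy_sequences.items():
    print(name, find_motif(seq))
```

A plain pattern search of this kind only flags candidate positions;
deciding whether a hit belongs to region 3 still requires the multiple
alignment shown in Fig. 2.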

In PolIk, region 5 corresponds with two antiparallel
β-strands (12 and 13) connected by a turn. One of the
residues located in this turn is Asp882 (indicated by a star
in Figs. 2 and 3), shown to be directly interacting with the
polymerization metal ion, in addition to the Gln708 and Glu710
residues described before (L. Beese, J. Friedman and T.A.
Steitz, personal communication). The catalytic significance
of this residue is consistent with the high conservation of
this Asp (or Glu) in most of the DNA polymerases
compared (Fig. 2). In the case of group D DNA polymerases,
the conservation of region 5 is significantly lower in
comparison with the other groups (Table I). Therefore, the
higher divergence of this region in group D, which could be
related to its proximity to the very C terminus, means that
the functional significance inferred for this region should be
taken with caution in the case of the protein-primed DNA
polymerases.
(v) dNTP binding. Joyce and Steitz (1987) have suggested
that the dNTP-binding site of PolIk lies between the
C-terminal end of the O-helix, the N-terminal end of the
Q-helix and the bed formed by strands 7, 8, 12 and 13
(Fig. 3). This location is consistent with photoaffinity
labelling studies showing a direct role of PolI Tyr766 (Rush and
Konigsberg, 1990), located at the C-terminal end of O-helix,
and PolI His881 (Pandey et al., 1987), which is located in the
turn between β-strands 12 and 13. Additionally, pyridoxal
phosphate, which binds competitively to the dNTP site,
reacts with PolI Lys758, also in the O-helix (Basu and
Modak, 1987). According to the alignment shown in Fig. 2,
region 4 corresponds to the O-helix in PolI. In agreement
with their functional importance, the residues corresponding
to PolI Lys758 and Tyr766 (indicated by stars in Figs. 2
and 3) are invariant in the DNA polymerases belonging to
group A. In the case of groups B, C and D, the corresponding
region also contains highly conserved Lys and Tyr
residues (boxed in Fig. 2) which could be the functional
counterparts of Lys758 and Tyr766 of PolI. In agreement
with that, substitution of the corresponding Lys498 residue
in region 4 of the φ29 DNA polymerase into Thr or Arg
(Fig. 2) completely destroyed the polymerization activity of
the enzyme (unpublished results). Secondary structure
analysis predicts, in all the DNA polymerases compared
including PolI, that region 4 is structured as an α-helix at
its N-terminal two thirds, whereas the C-terminal third has
a clear hydrophobic character (not shown). These data
suggest that a hydrophobic environment is probably
important to favour dNTP binding. In region 5, and also in
agreement with its functional importance, a His residue
corresponding to PolI His881 (indicated by a star in Figs. 2
and 3) is invariably present in group A DNA polymerases;
however, with the exception of the EBV DNA polymerase,
this residue is not conserved in groups B, C and D.
According to the three-dimensional model, the spatial proximity of
the residues involved in both metal and dNTP binding
(Fig. 3) agrees with the functional requirement of the dNTP
substrates in their metal-chelated form. This intimate
relationship is also reflected by the conservation of the distance
between regions 3, 4 and 5 in most of the evolutionarily
distant DNA polymerases compared (Figs. 2 and 5).

Recently, a 50-aa peptide encompassing the PolI O-helix
was shown to bind both nt substrates and duplex DNA
(Mullen et al., 1989). This fact, and the structural location
of O-helix (forming one side of the cleft), leads us to
speculate that region 4 could have a dual role in both dNTP and
DNA binding, being important to adjust the incoming nt at
the proper distance with respect to the primer-terminus and
template-nt. Thus, the coordinated binding of both
substrates, DNA and dNTPs, could provide an important
contribution to the insertion fidelity. In agreement with this
idea, it is worth noting that the phenotype of one of the
classical T4 mutator mutants (tsL88), having a reduced
specificity in inserting correct nt (Hershfield, 1973), is due
to the single change Gly694 → Ser; as shown in Fig. 2, this
position lies in region 4 of T4 DNA polymerase.
Fig. 5. Modular organization of enzymatic activities in DNA-dependent DNA polymerases. Each DNA polymerase is represented with the N terminus
at the lefthand side. Open squares indicate the three highly conserved N-terminal regions ExoI, ExoII and ExoIII (shown in Fig. 1), proposed to form
a general 3'-5' exonuclease-active site (Bernad et al., 1989). Blackened squares indicate the six highly conserved C-terminal regions shown in Fig. 2.
Regions 1, 2a and 2b are proposed to be mainly involved in DNA binding, whereas regions 3, 4 and 5 are proposed to be mainly involved in metal and
dNTP binding. The ExoIII segment was considered as an alignment axis for the N-terminal portion of each DNA polymerase. To reflect the variation
in length between regions 1 and 2, and the conservation of the distance between regions 3, 4 and 5, regions 1 and 3 were considered as two independent
alignment axes for the C-terminal portion of the DNA polymerases compared. Specific insertions in protein-primed DNA polymerases are indicated by
arrows in the φ29 DNA polymerase. DNA polymerase nomenclature is given in section a. A scale bar is given at the lower left.



(d) Modular organization in DNA-dependent DNA polymerases

As shown in Fig. 5, the presence of the linearly arranged
conserved regions ExoI, ExoII and ExoIII (3'-5'
exonuclease), and regions 1, 2a, 2b, 3, 4 and 5 (DNA
polymerization), in most of the DNA polymerases belonging to any
of the four groups, suggests that these two domains
correspond to an evolutionarily conserved Klenow-like DNA
polymerase 'core'. As shown in Fig. 5, Spn and Taq DNA
polymerases do not contain the ExoI and ExoII regions, in
agreement with the fact that they have very low, if any,
3'-5' exonuclease activity (López et al., 1989; Lawyer
et al., 1989). Other DNA polymerases contain N-terminal
and C-terminal segments out of the proposed 3'-5'
exonuclease and polymerase domains. In the case of PolI, the
N terminus contains a 5'-3' exonuclease activity which can
be structurally and functionally separated from the PolIk by
mild proteolysis (Lehman and Chien, 1973; Jacobsen et al.,
1974; Joyce et al., 1985). In Spn and Taq DNA
polymerases, the first 274 aa from the N terminus have evident
homology (36% and 37% identical residues, respectively)
with the 5'-3' exonuclease domain of PolI (López et al.,
1989; Lawyer et al., 1989). In agreement with these
structural data, the purified pneumococcal enzyme exhibits
5'-3' exonuclease activity (López et al., 1989). Recently,
an RNase H activity, which may have a similar role to that
of a 5'-3' exonuclease for removing RNA primers, has
been reported to be an integral part of the HSV DNA
polymerase (Crute and Lehman, 1989). Based on aa
sequence comparisons, this activity has been proposed to
reside in the N terminus of HSV and also in VZV, EBV and
HCMV DNA polymerases. In agreement with that, it has
been reported that this RNase H was not only found to be
associated to the 136-kDa HSV DNA polymerase, but also
to a 30-kDa polypeptide that may be a proteolytic fragment
analogous to the 'small' fragment (containing the 5'-3'
exonuclease) of PolI (Crute and Lehman, 1989). The size
of that polypeptide fits in the N-terminal region of HSV
DNA polymerase located upstream from the 3'-5'
exonuclease domain (Fig. 5). In the case of T7 and T5 DNA
polymerases, and supporting the existence of structural and
functional 'modules' for the different catalytic activities of
these enzymes, a separate 5'-3' exonuclease has been
reported to be homologous to that associated to PolI (López
et al., 1989; Leavitt and Ito, 1989). Whether the most
N-terminal and C-terminal portions of other DNA polymerases
contain other enzymatic activities or have some other
specific functions, such as interaction with accessory
proteins, remains to be determined.

(e) Conclusions

The structural and functional analyses reported in this
paper allow us to propose that the different DNA
polymerases compared share an overall structure similar to that
of the PolIk; nevertheless, significant differences, such as the
presence of specific regions or aa motifs in the proposed
DNA-binding, metal-binding and dNTP-binding sites,
probably confer their individual characteristics of
processivity and insertion fidelity.

Interestingly, in the case of the φ29 DNA polymerase,
one of the smallest DNA polymerases having 3'-5'
exonuclease activity, RNA primers are not required for a
priming protein-initiated mechanism of continuous DNA
replication and this enzyme is, by itself, highly processive and
able to produce strand displacement in the absence of any
accessory protein (Blanco et al., 1989). These properties are
likely related to the fact that its size is restricted to the
proposed 'Klenow-like core'. On the other hand, the
structural comparison of protein-primed DNA polymerases
(group D) with those belonging to the other three groups
indicates that the specific protein-priming function does not
correspond to any additional DNA polymerase structural
'module'. On the contrary, the presence of specific
insertions flanking region 2a (indicated with arrows in Fig. 5)
suggests that the ability to use a protein as a primer has been
acquired by modification of the DNA polymerization 'core'
of a primordial DNA polymerase.

Awaiting the resolution of the three-dimensional struc- 
ture of new DNA polymerases, the aa sequence similarities 
reported in this paper lead to testable predictions of critical 
residues for both structure and function in DNA-dependent 
DNA polymerases. 